35 research outputs found

    Context-aware Human Motion Prediction

    Get PDF
    The problem of predicting human motion given a sequence of past observations is at the core of many applications in robotics and computer vision. Current state-of-the-art approaches formulate this problem as a sequence-to-sequence task, in which a history of 3D skeletons feeds a Recurrent Neural Network (RNN) that predicts future movements, typically in the order of 1 to 2 seconds. However, one aspect that has been overlooked so far is the fact that human motion is inherently driven by interactions with objects and/or other humans in the environment. In this paper, we explore this scenario using a novel context-aware motion prediction architecture. We use a semantic-graph model where the nodes parameterize the human and objects in the scene and the edges their mutual interactions. These interactions are iteratively learned through a graph attention layer, fed with the past observations, which now include both object and human body motions. Once this semantic graph is learned, we inject it into a standard RNN to predict future movements of the human(s) and object(s). We consider two variants of our architecture: either freezing the contextual interactions in the future or updating them. A thorough evaluation on the "Whole-Body Human Motion Database" shows that, in both cases, our context-aware networks clearly outperform baselines in which the context information is not considered.
    Comment: Accepted at CVPR 2020
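    As a rough illustration of the idea (not the authors' implementation), the sketch below wires a graph-attention layer over human/object node features into a GRU that rolls out future poses autoregressively; all class names, feature dimensions and the prediction horizon are assumptions made for the example.

        # Illustrative sketch only: graph attention over scene nodes + autoregressive GRU.
        import torch
        import torch.nn as nn

        class ContextAwarePredictor(nn.Module):
            def __init__(self, node_dim=64, hidden=256, out_dim=63):
                super().__init__()
                # every node (human or object) attends to all others in the scene graph
                self.graph_attn = nn.MultiheadAttention(node_dim, num_heads=4, batch_first=True)
                self.encoder = nn.GRU(node_dim, hidden, batch_first=True)
                self.decoder = nn.GRU(out_dim, hidden, batch_first=True)
                self.readout = nn.Linear(hidden, out_dim)

            def forward(self, node_feats, last_pose, future_len=25):
                # node_feats: (B, T, N, node_dim) past features of N human/object nodes
                B, T, N, D = node_feats.shape
                x = node_feats.reshape(B * T, N, D)
                ctx, _ = self.graph_attn(x, x, x)        # learned mutual interactions
                ctx = ctx.mean(dim=1).reshape(B, T, D)   # pool nodes per time step
                _, h = self.encoder(ctx)                 # summarize the observed context
                preds, inp = [], last_pose.unsqueeze(1)  # (B, 1, out_dim)
                for _ in range(future_len):              # autoregressive future roll-out
                    out, h = self.decoder(inp, h)
                    inp = self.readout(out)
                    preds.append(inp)
                return torch.cat(preds, dim=1)           # (B, future_len, out_dim)

        poses = ContextAwarePredictor()(torch.randn(2, 50, 5, 64), torch.randn(2, 63))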

    Robot-aided cloth classification using depth information and CNNs

    Get PDF
    We present a system to deal with the problem of classifying garments from a pile of clothes. This system uses a robot arm to extract a garment and show it to a depth camera. Using only depth images of a partial view of the garment as input, a deep convolutional neural network has been trained to classify different types of garments. The robot can rotate the garment along the vertical axis in order to provide different views of the garment, increasing the prediction confidence and avoiding confusion. In addition to obtaining very high classification scores, our system provides a fast and occlusion-robust solution to the problem, compared to previous approaches to cloth classification that match the sensed data against a database.
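    A minimal sketch of the pipeline described above, assuming a small CNN over single-channel depth crops and simple averaging of softmax scores over the views produced while the robot rotates the garment; the architecture and class count are illustrative, not the paper's exact network.

        # Illustrative sketch: depth-only CNN classifier with multi-view score fusion.
        import torch
        import torch.nn as nn

        class DepthGarmentCNN(nn.Module):
            def __init__(self, n_classes=3):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(1, 32, 5, stride=2, padding=2), nn.ReLU(),
                    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1),
                )
                self.classifier = nn.Linear(128, n_classes)

            def forward(self, depth):                    # depth: (B, 1, H, W)
                return self.classifier(self.features(depth).flatten(1))

        def classify_multiview(model, views):
            # views: (V, 1, H, W) depth images taken while the robot rotates the garment
            with torch.no_grad():
                probs = torch.softmax(model(views), dim=1).mean(dim=0)  # fuse the views
            return probs.argmax().item(), probs

        label, probs = classify_multiview(DepthGarmentCNN(), torch.randn(4, 1, 128, 128))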

    Learned Vertex Descent: A New Direction for 3D Human Model Fitting

    Get PDF
    We propose a novel optimization-based paradigm for 3D human model fitting on images and scans. In contrast to existing approaches that directly regress the parameters of a low-dimensional statistical body model (e.g. SMPL) from input images, we train an ensemble of per-vertex neural fields. The network predicts, in a distributed manner, the vertex descent direction towards the ground truth, based on neural features extracted at the current vertex projection. At inference, we employ this network, dubbed LVD, within a gradient-descent optimization pipeline until its convergence, which typically occurs in a fraction of a second even when initializing all vertices at a single point. An exhaustive evaluation demonstrates that our approach is able to capture the underlying body of clothed people with very different body shapes, achieving a significant improvement over the state of the art. LVD is also applicable to 3D model fitting of humans and hands, for which we show a significant improvement over the SOTA with a much simpler and faster method.
    Comment: Project page: https://www.iri.upc.edu/people/ecorona/lvd
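    The learned-descent loop can be sketched as below, assuming a per-vertex field that consumes the current vertex position plus some per-vertex neural feature and returns a displacement; the real LVD extracts these features at the vertex image projections, which is omitted here, and layer sizes are assumptions.

        # Illustrative sketch: a learned per-vertex descent field driving an iterative fit.
        import torch
        import torch.nn as nn

        class VertexField(nn.Module):
            def __init__(self, feat_dim=128):
                super().__init__()
                self.mlp = nn.Sequential(
                    nn.Linear(3 + feat_dim, 256), nn.ReLU(),
                    nn.Linear(256, 3),             # predicted descent direction per vertex
                )

            def forward(self, verts, feats):       # verts: (V, 3), feats: (V, feat_dim)
                return self.mlp(torch.cat([verts, feats], dim=-1))

        def fit_vertices(field, feats, n_verts=6890, steps=30, step_size=0.5):
            verts = torch.zeros(n_verts, 3)        # all vertices start at a single point
            for _ in range(steps):                 # iterative "learned gradient descent"
                with torch.no_grad():
                    verts = verts + step_size * field(verts, feats)
            return verts

        verts = fit_vertices(VertexField(), torch.randn(6890, 128))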

    PI-Net: Pose Interacting Network for Multi-Person Monocular 3D Pose Estimation

    Get PDF
    Recent literature has addressed the monocular 3D pose estimation task very satisfactorily. In these studies, different persons are usually treated as independent pose instances to estimate. However, in many everyday situations, people are interacting, and the pose of an individual depends on the pose of his/her interactees. In this paper, we investigate how to exploit this dependency to enhance current (and possibly future) deep networks for 3D monocular pose estimation. Our pose interacting network, or PI-Net, inputs the initial pose estimates of a variable number of interactees into a recurrent architecture used to refine the pose of the person-of-interest. Evaluating such a method is challenging due to the limited availability of public annotated multi-person 3D human pose datasets. We demonstrate the effectiveness of our method on the MuPoTS dataset, setting the new state of the art on it. Qualitative results on other multi-person datasets (for which 3D pose ground truth is not available) showcase the proposed PI-Net. PI-Net is implemented in PyTorch and the code will be made available upon acceptance of the paper.
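    A minimal sketch of the refinement idea, with an assumed joint count and hidden size: a recurrent module summarizes the initial pose estimates of a variable number of interactees and predicts a residual correction for the person-of-interest pose.

        # Illustrative sketch: recurrent pose refinement conditioned on interactee poses.
        import torch
        import torch.nn as nn

        class PoseInteractingNet(nn.Module):
            def __init__(self, n_joints=17, hidden=256):
                super().__init__()
                self.pose_dim = n_joints * 3
                self.rnn = nn.GRU(self.pose_dim, hidden, batch_first=True)
                self.refine = nn.Linear(hidden, self.pose_dim)

            def forward(self, poi_pose, interactee_poses):
                # poi_pose: (B, J*3) initial estimate of the person of interest
                # interactee_poses: (B, K, J*3) initial estimates of K interactees (K may vary)
                _, h = self.rnn(interactee_poses)        # summarize the interactees
                return poi_pose + self.refine(h[-1])     # residual refinement

        refined = PoseInteractingNet()(torch.randn(2, 51), torch.randn(2, 3, 51))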

    D-NeRF: neural radiance fields for dynamic scenes

    Get PDF
    Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images. Among these, Neural Radiance Fields (NeRF) stand out, training a deep network to map 5D input coordinates (representing spatial location and viewing direction) into a volume density and view-dependent emitted radiance. However, despite achieving an unprecedented level of photorealism on the generated images, NeRF is only applicable to static scenes, where the same spatial location can be queried from different images. In this paper we introduce D-NeRF, a method that extends neural radiance fields to a dynamic domain, allowing us to reconstruct and render novel images of objects under rigid and non-rigid motions. For this purpose we consider time as an additional input to the system, and split the learning process into two main stages: one that encodes the scene into a canonical space and another that maps this canonical representation into the deformed scene at a particular time. Both mappings are learned using fully-connected networks. Once the networks are trained, D-NeRF can render novel images, controlling both the camera view and the time variable, and thus the object movement. We demonstrate the effectiveness of our approach on scenes with objects under rigid, articulated and non-rigid motions.
    This work is supported in part by a Google Daydream Research award and by the Spanish government with the project HuMoUR TIN2017-90086-R, the ERA-Net Chistera project IPALM PCI2019-103386 and the María de Maeztu Seal of Excellence MDM-2016-0656. Gerard Pons-Moll is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 409792180 (Emmy Noether Programme, project: Real Virtual Humans).
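    The two-stage mapping can be sketched as two fully-connected networks, as below; positional encoding, the volume-rendering integral and the actual layer sizes are omitted or assumed for brevity, so this is only an analogue of the described design.

        # Illustrative sketch: deformation MLP (x, t) -> canonical offset, then canonical NeRF MLP.
        import torch
        import torch.nn as nn

        def mlp(in_dim, out_dim, hidden=256, depth=4):
            layers, d = [], in_dim
            for _ in range(depth):
                layers += [nn.Linear(d, hidden), nn.ReLU()]
                d = hidden
            return nn.Sequential(*layers, nn.Linear(d, out_dim))

        class DNeRF(nn.Module):
            def __init__(self):
                super().__init__()
                self.deform = mlp(3 + 1, 3)        # (x, t) -> delta_x into canonical space
                self.canonical = mlp(3 + 3, 4)     # (x_canonical, view dir) -> (sigma, rgb)

            def forward(self, x, d, t):
                delta = self.deform(torch.cat([x, t], dim=-1))
                out = self.canonical(torch.cat([x + delta, d], dim=-1))
                sigma, rgb = out[..., :1], torch.sigmoid(out[..., 1:])
                return sigma, rgb

        sigma, rgb = DNeRF()(torch.randn(1024, 3), torch.randn(1024, 3), torch.rand(1024, 1))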

    SMPLicit: Topology-aware Generative Model for Clothed People

    Get PDF
    In this paper we introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry. In contrast to existing learning-based approaches that require training specific models for each type of garment, SMPLicit can represent in a unified manner different garment topologies (e.g. from sleeveless tops to hoodies and to open jackets), while controlling other properties like the garment size or tightness/looseness. We show our model to be applicable to a large variety of garments including T-shirts, hoodies, jackets, shorts, pants, skirts, shoes and even hair. The representation flexibility of SMPLicit builds upon an implicit model conditioned with the SMPL human body parameters and a learnable latent space which is semantically interpretable and aligned with the clothing attributes. The proposed model is fully differentiable, allowing for its use in larger end-to-end trainable systems. In the experimental section, we demonstrate SMPLicit can be readily used for fitting 3D scans and for 3D reconstruction in images of dressed people. In both cases we are able to go beyond the state of the art, by retrieving complex garment geometries, handling situations with multiple clothing layers and providing a tool for easy outfit editing. To stimulate further research in this direction, we will make our code and model publicly available at http://www.iri.upc.edu/people/ecorona/smplicit/.
    Comment: Accepted at CVPR 2021
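    A minimal sketch of the conditioning scheme, with assumed dimensionalities (the released model differs): an implicit decoder maps a 3D query point, a latent cloth code and SMPL parameters to an occupancy-like value, and its differentiability is what allows fitting by optimizing the conditioning variables.

        # Illustrative sketch: implicit garment decoder conditioned on a cloth code and SMPL params.
        import torch
        import torch.nn as nn

        class ImplicitClothDecoder(nn.Module):
            def __init__(self, latent_dim=18, smpl_dim=82, hidden=512):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(3 + latent_dim + smpl_dim, hidden), nn.ReLU(),
                    nn.Linear(hidden, hidden), nn.ReLU(),
                    nn.Linear(hidden, 1),                  # occupancy-like value per query point
                )

            def forward(self, points, z_cloth, smpl_params):
                # points: (N, 3) query points, z_cloth: (latent_dim,), smpl_params: (smpl_dim,)
                cond = torch.cat([z_cloth, smpl_params]).expand(points.shape[0], -1)
                return self.net(torch.cat([points, cond], dim=-1)).squeeze(-1)

        decoder = ImplicitClothDecoder()
        occ = decoder(torch.rand(4096, 3), torch.randn(18), torch.randn(82))
        # Because the decoder is differentiable w.r.t. z_cloth and smpl_params, both can be
        # optimized to fit a 3D scan or an image-based clothing segmentation.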

    Structured 3D Features for Reconstructing Controllable Avatars

    Full text link
    We introduce Structured 3D Features, a model based on a novel implicit 3D representation that pools pixel-aligned image features onto dense 3D points sampled from a parametric, statistical human mesh surface. The 3D points have associated semantics and can move freely in 3D space. This allows for optimal coverage of the person of interest, beyond just the body shape, which in turn additionally helps in modeling accessories, hair, and loose clothing. Owing to this, we present a complete 3D transformer-based attention framework which, given a single image of a person in an unconstrained pose, generates an animatable 3D reconstruction with albedo and illumination decomposition, as a result of a single end-to-end model, trained semi-supervised, and with no additional postprocessing. We show that our S3F model surpasses the previous state of the art on various tasks, including monocular 3D reconstruction, as well as albedo and shading estimation. Moreover, we show that the proposed methodology allows novel view synthesis, relighting, and re-posing the reconstruction, and can naturally be extended to handle multiple input images (e.g. different views of a person, or the same view in different poses, in video). Finally, we demonstrate the editing capabilities of our model for 3D virtual try-on applications.
    Comment: Accepted at CVPR 2023. Project page: https://enriccorona.github.io/s3f/, Video: https://www.youtube.com/watch?v=mcZGcQ6L-2
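    The core feature-pooling step might look like the sketch below, assuming a known camera projection and a stand-in convolutional encoder: image features are sampled at the projections of 3D points taken from the body surface, and a transformer encoder mixes the resulting per-point tokens. This is an illustrative analogue, not the S3F architecture.

        # Illustrative sketch: pixel-aligned features pooled onto body-surface points + transformer.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class Structured3DFeatures(nn.Module):
            def __init__(self, feat_dim=64, n_heads=4, n_layers=2):
                super().__init__()
                self.backbone = nn.Conv2d(3, feat_dim, 3, padding=1)    # stand-in image encoder
                self.point_embed = nn.Linear(3, feat_dim)
                enc_layer = nn.TransformerEncoderLayer(feat_dim, n_heads, batch_first=True)
                self.transformer = nn.TransformerEncoder(enc_layer, n_layers)

            def forward(self, image, points, uv):
                # image: (B, 3, H, W); points: (B, P, 3) sampled on the body surface;
                # uv: (B, P, 2) their projections in normalized [-1, 1] image coordinates.
                fmap = self.backbone(image)                              # (B, C, H, W)
                sampled = F.grid_sample(fmap, uv.unsqueeze(1), align_corners=False)
                tokens = sampled.squeeze(2).transpose(1, 2)              # (B, P, C) pixel-aligned
                tokens = tokens + self.point_embed(points)               # attach 3D point location
                return self.transformer(tokens)                          # (B, P, C) fused features

        feats = Structured3DFeatures()(torch.randn(1, 3, 256, 256),
                                       torch.randn(1, 300, 3), torch.rand(1, 300, 2) * 2 - 1)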

    Enhancing egocentric 3D pose estimation with third person views

    Get PDF
    We propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured with a single wearable camera. The main technical contribution consists of leveraging high-level features linking first- and third-person views in a joint embedding space. To learn such an embedding space we introduce First2Third-Pose, a new paired synchronized dataset of nearly 2000 videos depicting human activities captured from both first- and third-person perspectives. We explicitly consider spatial- and motion-domain features, combined using a semi-Siamese architecture trained in a self-supervised fashion. Experimental results demonstrate that the joint multi-view embedding space learned with our dataset is useful to extract discriminative features from arbitrary single-view egocentric videos, with no need for any sort of domain adaptation or knowledge of camera parameters. An extensive evaluation demonstrates that we achieve a significant improvement in egocentric 3D body pose estimation performance on two unconstrained datasets, over three supervised state-of-the-art approaches. The collected dataset and pre-trained model are available for research purposes.
    This work has been partially supported by projects PID2020-120049RB-I00 and PID2019-110977GA-I00 funded by MCIN/AEI/10.13039/501100011033 and by the "European Union NextGenerationEU/PRTR", as well as by grant RYC-2017-22563 funded by MCIN/AEI/10.13039/501100011033 and by "ESF Investing in your future", and network RED2018-102511-T funded by MCIN/AEI.
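    A minimal sketch of a semi-Siamese embedding trained in a self-supervised way, under the assumption of precomputed clip features and an InfoNCE-style objective (the paper's exact branches and loss differ): two view-specific encoders share a final projection head, and synchronized first-/third-person pairs are pulled together in the joint space.

        # Illustrative sketch: semi-Siamese first-/third-person embedding with a contrastive loss.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class First2ThirdEmbedding(nn.Module):
            def __init__(self, in_dim=2048, emb_dim=256):
                super().__init__()
                self.ego_encoder = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU())
                self.exo_encoder = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU())
                self.shared_head = nn.Linear(512, emb_dim)   # shared part of the semi-Siamese net

            def forward(self, ego_feats, exo_feats):
                z_ego = F.normalize(self.shared_head(self.ego_encoder(ego_feats)), dim=-1)
                z_exo = F.normalize(self.shared_head(self.exo_encoder(exo_feats)), dim=-1)
                return z_ego, z_exo

        def contrastive_loss(z_ego, z_exo, temperature=0.07):
            logits = z_ego @ z_exo.t() / temperature         # (B, B) similarity matrix
            targets = torch.arange(z_ego.shape[0])           # matched pairs lie on the diagonal
            return F.cross_entropy(logits, targets)

        model = First2ThirdEmbedding()
        loss = contrastive_loss(*model(torch.randn(8, 2048), torch.randn(8, 2048)))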

    SMPLicit: Topology-aware generative model for clothed people

    Get PDF
    In this paper we introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry. In contrast to existing learning-based approaches that require training specific models for each type of garment, SMPLicit can represent in a unified manner different garment topologies (e.g. from sleeveless tops to hoodies and to open jackets), while controlling other properties like the garment size or tightness/looseness. We show our model to be applicable to a large variety of garments including T-shirts, hoodies, jackets, shorts, pants, skirts, shoes and even hair. The representation flexibility of SMPLicit builds upon an implicit model conditioned with the SMPL human body parameters and a learnable latent space which is semantically interpretable and aligned with the clothing attributes. The proposed model is fully differentiable, allowing for its use in larger end-to-end trainable systems. In the experimental section, we demonstrate SMPLicit can be readily used for fitting 3D scans and for 3D reconstruction in images of dressed people. In both cases we are able to go beyond the state of the art, by retrieving complex garment geometries, handling situations with multiple clothing layers and providing a tool for easy outfit editing. To stimulate further research in this direction, we will make our code and model publicly available at http://www.iri.upc.edu/people/ecorona/smplicit/.
    This work is supported in part by an Amazon Research Award and by the Spanish government with the projects HuMoUR TIN2017-90086-R and María de Maeztu Seal of Excellence MDM-2016-0656. Gerard Pons-Moll is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 409792180 (Emmy Noether Programme, project: Real Virtual Humans).